A new kernel for SVM MLLR based speaker recognition
نویسندگان
چکیده
Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a Gaussian mixture model (GMM), and maximum likelihood linear regression (MLLR) adaptation is used to adapt the means of the UBM. We examine two possible SVM feature expansions that arise in this context: the first, a GMM supervector is constructed by stacking the means of the adapted GMM, and the second consists of the elements of the MLLR transform. We examine several kernels associated with these expansions. We show that both expansions are equivalent given an appropriate choice of kernels. Experiments performed on the NIST SRE 2006 corpus clearly highlight that our choice of kernels, which are motivated by distance metrics between GMMs, outperform ad-hoc ones. We also show that by using multi-class MLLR adaptation we can approach state of the art performance.
منابع مشابه
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملSpeaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
متن کاملAn anticorrelation kernel for improved system combination in speaker verification
This paper presents a method for training SVM-based classification systems for combination with other existing classification systems designed for the same task. Ideally, a new system should be designed such that, when combined with the existing systems, the resulting performance is optimized. To achieve this goal, we include a regularization term in the SVM objective function that aims to redu...
متن کاملMLLR transforms as features in speaker recognition
We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard framebased cepstral speaker recognition models, it normalizes for the choice of spoken words in text-independent speaker verification. Affine transforms are computed for the Gaussian means of the acoustic models used in a re...
متن کاملFactor Analysis Back Ends for MLLR Transforms in Speaker Recognition
The purpose of this work is to show how recent developments in cepstral-based systems for speaker recognition can be leveraged for the use of Maximum Likelihood Linear Regression (MLLR) transforms. Speaker recognition systems based on MLLR transforms have shown to be greatly beneficial in combination with standard systems, but most of the advances in speaker modeling techniques have been implem...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007